fix(gardener/digest): grow budget + filter .gardener-tree-cache/drift noise + surface truncation #344

Open
serenakeyitan wants to merge 1 commit into main from fix/digest-noise-and-budget

Addresses #343 items 1 & 2. Leaves items 3–5 (leaf inclusion, soft_links, task-aware models, docs) for follow-ups.

Summary

  • Digest budget 100KB → 500KB (models have 200K-token / ~800KB windows; the old cap silently truncated nodes on larger trees).

  • SKIP_DIRS gains .gardener-tree-cache (gardener's own tree snapshot — stale duplicates).

  • Auto-generated drift/ placeholder nodes (summary: "Auto-generated intermediate node for sync proposals") are filtered before they eat budget.

  • New collectTreeDigestDetailed() reports skippedAsNoise / truncatedCount / budgetExhausted. Both classifier factories (claude-cli, anthropic-api) now pass these through emitDigestDiagnostics() to a write sink (defaults to process.stderr), so previously-silent truncation becomes visible:

    gardener: tree digest filtered 5 drift placeholder node(s) (auto-generated by prior sync)
    gardener: tree digest budget exhausted — 17 node(s) dropped. Verdict may miss relevant tree context. …
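
A minimal sketch of the accounting described in the last bullet, assuming nodes are visited in order and each entry costs the UTF-8 size of one digest bullet. Only the three counter names come from this PR; `DigestNode`, `DetailedDigest`, `entryBytes`, and `collectWithBudget` are hypothetical names for illustration, and the real `collectTreeDigestDetailed()` may be structured differently.

```typescript
// Hypothetical shapes: the PR names the three counters but not the full result type.
interface DigestNode {
  path: string;
  summary: string;
}

interface DetailedDigest {
  kept: DigestNode[];
  skippedAsNoise: number;
  truncatedCount: number;
  budgetExhausted: boolean;
}

// Assumed cost of one digest entry: a "- path: summary" bullet, measured in UTF-8 bytes.
function entryBytes(node: DigestNode): number {
  return new TextEncoder().encode(`- ${node.path}: ${node.summary}\n`).length;
}

function collectWithBudget(
  nodes: DigestNode[],
  budgetBytes: number,
  isNoise: (node: DigestNode) => boolean,
): DetailedDigest {
  const result: DetailedDigest = {
    kept: [],
    skippedAsNoise: 0,
    truncatedCount: 0,
    budgetExhausted: false,
  };
  let used = 0;
  for (const node of nodes) {
    if (isNoise(node)) {
      // Noise is dropped before it can consume any budget.
      result.skippedAsNoise += 1;
      continue;
    }
    const cost = entryBytes(node);
    if (result.budgetExhausted || used + cost > budgetBytes) {
      // Once the budget is exhausted, every remaining real node counts as dropped.
      result.budgetExhausted = true;
      result.truncatedCount += 1;
      continue;
    }
    used += cost;
    result.kept.push(node);
  }
  return result;
}
```

Under these assumptions every input node lands in exactly one bucket, so `kept.length + skippedAsNoise + truncatedCount` equals the number of nodes visited. That is one plausible reading of the "accounting invariants" the tests cover.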
    

E2E results on paperclipai/paperclip + paperclip-tree

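|                        | before | after  | delta      |
| ---------------------- | ------ | ------ | ---------- |
| digest nodes kept      | 138    | 64     | −74 (−54%) |
| digest bytes           | 24,794 | 11,188 | −55%       |
| .gardener-tree-cache   | in     | out    | eliminated |
| drift/ placeholders    | in     | out    | 5 filtered |
| budget headroom        | 100KB  | 500KB  | 5×         |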

Ran gardener comment --pr 4367 --repo paperclipai/paperclip against the built CLI:

  • Diagnostic line fires ("filtered 5 drift placeholder node(s)").
  • Classifier completes against the cleaner digest.
  • Verdict sharpens NEEDS_REVIEW → NEW_TERRITORY — the more precise call for a PR introducing a governance behavior no existing tree node covers.

Tests

  • New tests/gardener/tree-digest.test.ts: 10 tests covering the SKIP_DIRS addition, drift filter (including the anchored-regex guard against false positives), accounting invariants, and emitDigestDiagnostics output in all four states.
  • All 380 existing gardener tests pass, including the 11 claude-cli classifier tests and 19 anthropic classifier tests that exercise this call path.
  • Typecheck clean; build clean.

Out of scope (explicit deferrals)

Intentionally not in this PR, from #343:

  • Leaf-file inclusion when a PR diff's paths overlap a node domain's path prefix. Needs heuristic design review.
  • soft_links transitive inclusion in the digest. Same.
  • Task-aware default model (haiku for `comment`, sonnet-4-6 for `sync` / `draft-node`). Needs per-command plumbing + benchmark verification.
  • GARDENER_CLASSIFIER_MODEL documented in onboarding.md Step 6. Separate docs PR.

Those are better reviewed independently.

Closes / refs

Refs #343.

… + drift placeholders; warn on budget exhaustion

Addresses #343 items 1 & 2 (budget growth + noise filtering). Leaves
items 3–5 (leaf inclusion, soft_links, task-aware model selection,
onboarding.md doc) for follow-up PRs since they involve more design.

## What

Three focused changes to `tree-digest.ts`:

1. **DIGEST_BUDGET_BYTES: 100KB → 500KB.** Today's Claude 4.5/4.6/4.7
   models give us a 200K-token (~800KB) context window; the old
   budget was severely over-conservative and silently truncating
   nodes from the classifier's view on larger trees. 500KB covers
   ~2700 NODE.md entries with room for PR body + diff + response.
2. **SKIP_DIRS gains `.gardener-tree-cache`.** Gardener's own
   per-sweep snapshot of the tree — stale duplicates of real NODE.md
   files — should never enter the digest. On the paperclip-tree
   repo this alone cut the digest node count roughly in half.
3. **Drop `drift/` auto-generated placeholder nodes by summary
   regex.** These are scaffolding nodes gardener writes to give
   proposal PRs a valid parent chain (summary: "Auto-generated
   intermediate node for sync proposals"). They carry zero decision
   signal — safe to filter before they eat budget.
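
A rough sketch of the two noise filters in items 2 and 3, under stated assumptions: only the `.gardener-tree-cache` entry and the placeholder summary text are taken from this change. The other `SKIP_DIRS` entries, the exact regex, and the helper names `isSkippedPath` / `isDriftPlaceholder` are illustrative and may not match `tree-digest.ts`.

```typescript
// Assumption: SKIP_DIRS is a set of directory names. Only `.gardener-tree-cache`
// is confirmed by this change; the other entries are illustrative.
const SKIP_DIRS = new Set<string>([".git", "node_modules", ".gardener-tree-cache"]);

// Anchored at both ends so a real summary that merely mentions the phrase
// still passes through. The actual pattern in tree-digest.ts may differ.
const DRIFT_PLACEHOLDER_SUMMARY = /^Auto-generated intermediate node for sync proposals\.?$/;

// True when any path segment is a skipped directory.
function isSkippedPath(relativePath: string): boolean {
  return relativePath.split("/").some((segment) => SKIP_DIRS.has(segment));
}

// True only when the whole summary is the auto-generated placeholder text.
function isDriftPlaceholder(summary: string): boolean {
  return DRIFT_PLACEHOLDER_SUMMARY.test(summary.trim());
}
```

Anchoring the pattern is what keeps a hand-written node whose summary happens to quote the placeholder phrase from being dropped as noise.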

Plus one observability change:

4. **`collectTreeDigestDetailed()` reports `skippedAsNoise`,
   `truncatedCount`, and `budgetExhausted`.** The two classifier
   factories (`claude-cli`, `anthropic-api`) now call the detailed
   variant and emit `emitDigestDiagnostics()` output via an
   injectable `write` sink (defaults to `process.stderr`). Before:
   budget exhaustion was silent → verdict could cite wrong nodes
   as "closest match" with no trace. Now:

       gardener: tree digest filtered 5 drift placeholder node(s) (auto-generated by prior sync)
       gardener: tree digest budget exhausted — 17 node(s) dropped. Verdict may miss relevant tree context. Consider pruning the tree or raising DIGEST_BUDGET_BYTES.
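
For illustration, the injectable-sink pattern from item 4 could look like the sketch below. The message text is copied from the sample output above; the `DigestDiagnostics` shape, the `WriteSink` type, and the function name `emitDigestDiagnosticsSketch` are assumptions, not the actual signature in this branch.

```typescript
// Assumed shape: the three counters named in item 4.
interface DigestDiagnostics {
  skippedAsNoise: number;
  truncatedCount: number;
  budgetExhausted: boolean;
}

// Hypothetical sink type: anything that accepts one line of text.
type WriteSink = (line: string) => void;

function emitDigestDiagnosticsSketch(
  diag: DigestDiagnostics,
  // Defaults to stderr so diagnostics never mix with the verdict on stdout.
  write: WriteSink = (line) => {
    process.stderr.write(line);
  },
): void {
  if (diag.skippedAsNoise > 0) {
    write(
      `gardener: tree digest filtered ${diag.skippedAsNoise} drift placeholder node(s) (auto-generated by prior sync)\n`,
    );
  }
  if (diag.budgetExhausted) {
    write(
      `gardener: tree digest budget exhausted — ${diag.truncatedCount} node(s) dropped. ` +
        `Verdict may miss relevant tree context. Consider pruning the tree or raising DIGEST_BUDGET_BYTES.\n`,
    );
  }
  // A healthy digest (no noise, no truncation) writes nothing.
}
```

Passing the sink as a parameter lets a test capture lines in an array instead of patching `process.stderr`, which fits the four states the new tests exercise (noise only, budget only, both, silent).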

## E2E results on paperclipai/paperclip + paperclip-tree

Before/after on the real tree (v0.3.2 main → this branch):

|                          | before | after  | delta            |
| ------------------------ | ------ | ------ | ---------------- |
| digest nodes kept        | 138    | 64     | −74 (−54%)       |
| digest bytes             | 24,794 | 11,188 | −55%             |
| `.gardener-tree-cache`   | in     | out    | eliminated       |
| `drift/` placeholders    | in     | out    | 5 filtered       |
| budget headroom          | 100KB  | 500KB  | 5×               |
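
As a sanity check on the table, the arithmetic below recomputes the percentage deltas and estimates how many entries of this size fit in the new budget. It assumes "500KB" means 500 × 1024 bytes (the constant's exact value is not shown here) and that paperclip-tree's average entry size is representative; `avgBytesPerNode` and `estimatedCapacity` are hypothetical helpers.

```typescript
// Figures copied from the before/after table above.
const before = { nodes: 138, bytes: 24794 };
const after = { nodes: 64, bytes: 11188 };

// Assumption: "500KB" is 500 * 1024 bytes.
const ASSUMED_BUDGET_BYTES = 500 * 1024;

interface DigestSample {
  nodes: number;
  bytes: number;
}

function avgBytesPerNode(sample: DigestSample): number {
  return sample.bytes / sample.nodes;
}

// How many average-sized entries fit if the whole budget went to the digest.
function estimatedCapacity(sample: DigestSample, budgetBytes: number): number {
  return Math.floor(budgetBytes / avgBytesPerNode(sample));
}

// Percentage reductions, rounded the same way as the table.
const nodeReductionPct = Math.round(((before.nodes - after.nodes) / before.nodes) * 100);
const byteReductionPct = Math.round(((before.bytes - after.bytes) / before.bytes) * 100);

const capacityAtNewBudget = estimatedCapacity(after, ASSUMED_BUDGET_BYTES);
```

On these inputs the reductions round to 54% and 55%, matching the table. Entries average roughly 175 to 180 bytes, so the new budget holds a little under 3,000 of them, which is broadly in line with the "~2700 NODE.md entries" figure once some slack is allowed for larger summaries.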

Ran `gardener comment --pr 4367 --repo paperclipai/paperclip` end-to-end
against the built CLI:

- Diagnostic line fired (`filtered 5 drift placeholder node(s)`).
- Classifier completed against the cleaner digest.
- Verdict sharpened from NEEDS_REVIEW (v0.3.2) → NEW_TERRITORY,
  which is the more precise call: PR #4367 adds a queue-sweep
  governance behavior no existing tree node covers. The cleaner
  digest + removal of stale cache duplicates let the classifier
  confidently place the PR in "new territory" instead of hedging.

## Tests

- New `tests/gardener/tree-digest.test.ts` (10 tests):
  - `.gardener-tree-cache` is skipped.
  - Drift placeholder summary is filtered; same-word mentions in real
    summaries pass through (anchored regex).
  - Detailed-result accounting stays self-consistent.
  - `emitDigestDiagnostics` emits the right lines (noise only,
    budget only, both, or silent on a healthy digest).
  - `formatDigest` bullet format unchanged.
- All existing gardener tests pass (380/380 in `tests/gardener/`,
  including the 11 claude-cli classifier tests and 19 anthropic
  classifier tests).
- Typecheck clean.

## Scope

Deliberately does not ship items 3–5 of #343:

- Leaf-file inclusion when PR diff paths overlap a node domain.
- `soft_links` transitive inclusion.
- Task-aware default model (haiku for comment, sonnet-4-6 for
  sync/draft-node).
- `GARDENER_CLASSIFIER_MODEL` documented in onboarding.md.

Those involve more design (heuristic tuning, per-command plumbing,
doc edits) and are better reviewed separately.